10 research outputs found

    Optimización de la Entrada Salida mediante librerías y lenguajes paralelos [Input/Output Optimization Using Parallel Libraries and Languages]

    One of the great challenges of HPC (High Performance Computing) is optimizing the Input/Output (I/O) subsystem. Ken Batcher sums this up in the following sentence: "A supercomputer is a device for turning compute-bound problems into I/O-bound problems". In other words, the bottleneck no longer lies so much in processing the data as in their availability. Moreover, this problem will be exacerbated by the arrival of Exascale and the popularization of Big Data applications. In this context, this thesis contributes to improving the performance and ease of use of the I/O subsystem of supercomputing systems. Two main contributions are proposed: i) an I/O interface developed for the Chapel language that improves programmer productivity when coding I/O operations; and ii) an optimized implementation of genetic sequence data storage. In more detail, the first contribution studies and analyzes different I/O optimizations in Chapel, while providing users with a simple interface for parallel and distributed access to data contained in files. We thus contribute both to increasing developer productivity and to making the implementation as efficient as possible. The second contribution also falls within the scope of I/O problems, but in this case it focuses on improving the storage of genetic sequence data, including their compression, and on enabling existing applications to use those data efficiently, with efficient retrieval in both sequential and random order. Additionally, we propose a parallel implementation based on Chapel

    miRNA as biomarker in lung cancer

    Lung cancer has a high prevalence and mortality due to its late diagnosis and limited treatment, so it is essential to find biomarkers that allow a faster diagnosis and improve the survival of these patients. In this sense, biomarkers based on miRNAs have represented a considerable advance. miRNAs, which are small RNA sequences, can regulate gene expression, so they play an essential role not only as diagnostic biomarkers but also as therapeutic and prognostic ones. Also, miRNA biomarkers can be obtained from liquid biopsies, which are less intrusive than lung biopsies and offer better accessibility, safety and repeatability, which allows those biomarkers to be used both for diagnosis and for monitoring of patients. In this review, we highlight the importance of miRNAs and collect the existing evidence of their relationship with lung cancer. Funding for open access charge: Universidad de Málaga / CBUA. Funding for open access publishing: Universidad Málaga / CBU

    Whole-Genome Assembly: An Experimental Study of Computational Costs and Architectural Opportunities

    Whole-genome sequencing (WGS) provides a huge amount of reads from which a complete genome can be assembled. The recent advent of long-read sequencing technologies, such as PacBio and Oxford Nanopore, and the subsequent appearance of high-quality long reads (single-molecule high-fidelity, or HiFi) have improved the scaffolding of the genome. However, both the biology and computing communities still face great challenges in terms of computational cost. Thus, a high-precision characterization of the methods is essential to correctly identify the main computing bottlenecks. This study will allow us to design new methods that mitigate computational costs without losing accuracy and to adapt such methods to fully exploit new architectures that provide support for handling large amounts of data. In this paper, we experimentally study and characterize the most used whole-genome assemblers in order to design new approaches in this field. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Comparing assembly strategies for third-generation sequencing technologies across different genomes

    The recent advent of long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore technology (ONT), has led to substantial accuracy and computational cost improvements. However, de novo whole-genome assembly still presents significant challenges related to the computational cost and the quality of the results. Accordingly, sequencing accuracy and throughput continue to improve, and many tools are constantly emerging. Therefore, selecting the correct sequencing platform, the proper sequencing depth and the assembly tools is necessary to perform high-quality assembly. This paper evaluates the primary assembly reconstruction from recent hybrid and non-hybrid pipelines on different genomes. We find that using PacBio high-fidelity long reads (HiFi) plays an essential role in haplotype construction with respect to ONT reads. However, we observe a substantial improvement in the correctness of the assembly from high-fidelity ONT datasets and from combining them with HiFi or short reads. Funding for open access charge: Universidad de Málaga / CBU

    Comparing assembly strategies for third-generation sequencing technologies across different genomes

    The recent advent of long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore technology (ONT), has led to substantial accuracy and computational cost improvements. However, de novo whole-genome assembly still presents significant challenges related to the computational cost and the quality of the results. Accordingly, sequencing accuracy and throughput continue to improve, and many tools are constantly emerging. Therefore, selecting the correct sequencing platform, the proper sequencing depth and the assembly tools is necessary to perform high-quality assembly. This paper evaluates the primary assembly reconstruction from recent hybrid and non-hybrid pipelines on different genomes. We find that using PacBio high-fidelity long reads (HiFi) plays an essential role in haplotype construction with respect to ONT reads. However, we observe a substantial improvement in the correctness of the assembly from high-fidelity ONT datasets and from combining them with HiFi or short reads. This work has been partially supported by the Spanish MINECO PID2019-105396RB-I00, Junta de Andalucia JA2018 P18-FR-3433, and UMA18-FEDERJA-197 projects. Funding for open access charge: Universidad de Málaga/CBUA. Peer Reviewed. Postprint (published version

    CARB-ES-19 Multicenter Study of Carbapenemase-Producing Klebsiella pneumoniae and Escherichia coli From All Spanish Provinces Reveals Interregional Spread of High-Risk Clones Such as ST307/OXA-48 and ST512/KPC-3

    Objectives: CARB-ES-19 is a comprehensive, multicenter, nationwide study integrating whole-genome sequencing (WGS) in the surveillance of carbapenemase-producing K. pneumoniae (CP-Kpn) and E. coli (CP-Eco) to determine their incidence, geographical distribution, phylogeny, and resistance mechanisms in Spain. Methods: In total, 71 hospitals, representing all 50 Spanish provinces, collected the first 10 isolates per hospital (February to May 2019); CPE isolates were first identified according to EUCAST (meropenem MIC > 0.12 mg/L with immunochromatography, colorimetric tests, carbapenem inactivation, or carbapenem hydrolysis with MALDI-TOF). Prevalence and incidence were calculated according to population denominators. Antibiotic susceptibility testing was performed using the microdilution method (EUCAST). All 403 isolates collected were sequenced for high-resolution single-nucleotide polymorphism (SNP) typing, core genome multilocus sequence typing (cgMLST), and resistome analysis. Results: In total, 377 (93.5%) CP-Kpn and 26 (6.5%) CP-Eco isolates were collected from 62 (87.3%) hospitals in 46 (92%) provinces. CP-Kpn was more prevalent in the blood (5.8%, 50/853) than in the urine (1.4%, 201/14,464). The cumulative incidence for both CP-Kpn and CP-Eco was 0.05 per 100 admitted patients. The main carbapenemase genes identified in CP-Kpn were blaOXA-48 (263/377), blaKPC-3 (62/377), blaVIM-1 (28/377), and blaNDM-1 (12/377). All isolates were susceptible to at least two antibiotics. Interregional dissemination of eight high-risk CP-Kpn clones was detected, mainly ST307/OXA-48 (16.4%), ST11/OXA-48 (16.4%), and ST512-ST258/KPC (13.8%). ST512/KPC and ST15/OXA-48 were the most frequent bacteremia-causative clones. The average number of acquired resistance genes was higher in CP-Kpn (7.9) than in CP-Eco (5.5). Conclusion: This study serves as a first step toward WGS integration in the surveillance of carbapenemase-producing Enterobacterales in Spain. We detected important epidemiological changes, including increased CP-Kpn and CP-Eco prevalence and incidence compared to previous studies, wide interregional dissemination, and increased dissemination of high-risk clones, such as ST307/OXA-48 and ST512/KPC-3

    MOESM6 of Automated identification of reference genes based on RNA-seq data

    Additional file 6. Best candidate RGs for normal and malignant lung samples according to Fig. 6b, ranked by CV. They were obtained with CV < 20% and minimum counted reads of 10,000. Transcript_id: human transcript identifiers in ENSEMBL database

    MOESM2 of Automated identification of reference genes based on RNA-seq data

    Additional file 2. Best RGs in olive tree pistil according to Fig. 2a, ranked by CV. They were obtained for different stages of pistil development with CV < 10% and minimum counted reads of 100. Transcript_id: transcript identifiers in the ReprOlive transcriptome

    Characteristics and predictors of death among 4035 consecutively hospitalized patients with COVID-19 in Spain
